V2 #15

Nayjest · 2025-10-24T22:11:33Z

Major refactoring
Added component support
The config module no longer instantiates logger instances — this is now handled in the bootstrap process
LogEntry has been renamed to RequestContext and now includes additional fields such as model, user_info, and extra
config.check_api_key renamed to config.api_key_check
Added config loaders for TOML, YAML, JSON, and Python; custom config loaders can be registered via the config.loaders entry point
Reworked bootstrap messaging
Environment variable substitution is now handled via utils.replace_env_strings_recursive
lm_proxy.loggers.log_writers.JsonLogWriter moved to lm_proxy.loggers.JsonLogWriter
API key check functions can now return additional data as a tuple (interpreted as user_info)

- Major refactoring - Components support - Config now do not instantiate logger instances, that's done in bootstrap - LogEntry is now RequestContext and includes additional fields like 'model', 'user_info', 'extra' - config.check_api_key renamed to config.api_key_check - Config loaders: toml, yaml, json, py; Custom config loaders may be registered via "config.loaders" entry point - Bootstrap messaging reworked - Substituting env:vars now is done via utils.replace_env_strings_recursive - lm_proxy.loggers.log_writers.JsonLogWriter moved to lm_proxy.loggers.JsonLogWriter - API check functions now can return additional data in a tuple (treated as user_info)

github-actions · 2025-10-26T17:38:18Z

I've Reviewed the Code

This release delivers a significant architectural upgrade: pluggable API-key validation (config/HTTP), a unified RequestContext with non-blocking, transformable logging, multi-format config loading via entry points, improved bootstrap/CLI/debug controls and docs, renamed models endpoint, added tests, and general refinements (e.g., requests dependency, coverage badge).

⚠️ 11 issues found across 33 files

`#1` Broken Table of Contents anchor for Advanced Usage

README.md L38

The TOC link for Advanced Usage uses a percent-encoded variation selector ("#%EF%B8%8F-advanced-usage"), which does not match GitHub's generated anchor for the heading "## 🛠️ Advanced Usage". This results in a broken link. The correct anchor should be "#-advanced-usage".
Tags: bug, readability
Affected code:

38: - [Advanced Usage](#%EF%B8%8F-advanced-usage)

Proposed change:

  - [Advanced Usage](#-advanced-usage)

`#2` Low contrast text on yellow background reduces readability

coverage.svg L19

The right badge segment color was changed to #dfb317 (bright yellow) but the percentage text remains white (#fff), resulting in poor contrast and reduced readability/accessibility. Switching the percentage text color to black ensures sufficient contrast.
Tags: readability
Affected code:

19:         <text x="80" y="14">65%</text>

Proposed change:

        <text x="80" y="14" fill="#000">65%</text>

`#3` Uncaught JSON parsing/type errors can raise exceptions and bypass error handling

lm_proxy/api_key_check/with_request.py L41-L43

When response_as_user_info is True, response.json() is called without guarding against JSON decoding errors, and the code assumes the result is a dict. Invalid or non-dict JSON will raise ValueError or cause AttributeError on .get(), which is not caught by the current except (it only catches requests.exceptions.RequestException). This can crash the check instead of returning None like other error paths.
Tags: bug, maintainability
Affected code:

41:                     user_info = response.json()
42:                     if self.group_field:
43:                         group = user_info.get(self.group_field, self.default_group)

Proposed change:

                    try:
                        user_info = response.json()
                    except ValueError:
                        return None
                    if self.group_field:
                        if isinstance(user_info, dict):
                            group = user_info.get(self.group_field, self.default_group)
                        else:
                            group = self.default_group

`#4` Incorrect class path in ImportError message

lm_proxy/api_key_check/with_request.py L54

The error message references 'lm_proxy.api_key_check.CheckAPIKeyWithRequest', but the actual class resides in 'lm_proxy.api_key_check.with_request'. This typo can mislead users when installing optional dependencies.
Tags: bug, maintainability, code-style
Affected code:

54:                     "Using 'lm_proxy.api_key_check.CheckAPIKeyWithRequest' with 'use_cache = true' "

Proposed change:

                    "Using 'lm_proxy.api_key_check.with_request.CheckAPIKeyWithRequest' with 'use_cache = true' "

`#5` Function may return None or non-dict despite annotation and docstring

lm_proxy/config_loaders/yaml.py L16-L17

yaml.safe_load can return None for empty files or a non-dict type (e.g., list, scalar) if the YAML root is not a mapping. This contradicts the function's return type and docstring, and can cause downstream runtime errors. Enforce that the parsed content is a dictionary and raise a clear error otherwise.
Tags: bug, maintainability
Affected code:

16:         return yaml.safe_load(f)

Proposed change:

    with open(config_path, "r", encoding="utf-8") as f:
        data = yaml.safe_load(f)
    if not isinstance(data, dict):
        raise ValueError(
            f"Invalid YAML config at '{config_path}': expected a mapping (dictionary) at the document root."
        )
    return data

`#6` Awaiting log_non_blocking causes TypeError when no loggers and defeats non-blocking intent

lm_proxy/core.py L123,L263,L265,L295-L300

log_non_blocking is awaited in multiple places, but it returns None when no loggers are configured, which will raise a TypeError when awaited. Even when loggers exist, awaiting the returned Task blocks until logging completes, violating the non-blocking intent.
Tags: bug, anti-pattern, maintainability
Affected code:

123:     await log_non_blocking(log_entry)

Proposed change:

    log_non_blocking(log_entry)

Affected code:

263:         await log_non_blocking(log_entry)

Proposed change:

        log_non_blocking(log_entry)

Affected code:

265:     await log_non_blocking(log_entry)

Proposed change:

    log_non_blocking(log_entry)

Affected code:

295: async def log_non_blocking(
296:     log_entry: RequestContext,
297: ) -> Optional[asyncio.Task]:
298:     if env.loggers:
299:         task = asyncio.create_task(log(log_entry))
300:         return task

Proposed change:

def log_non_blocking(
    log_entry: RequestContext,
) -> Optional[asyncio.Task]:
    if env.loggers:
        task = asyncio.create_task(log(log_entry))
        return task
    return None

`#7` Forwarded header parsing can raise IndexError and crash request handling

lm_proxy/utils.py L77-L79

get_client_ip assumes the Forwarded header always contains 'for=' and uses split()[1], which raises IndexError when the header is present but lacks 'for='. This can crash request handling. Parsing should be done safely and case-insensitively, extracting the first for= value if present and otherwise falling back.
Tags: bug, maintainability
Affected code:

77:     if forwarded := request.headers.get("Forwarded"):
78:         # Parse Forwarded header (RFC 7239)
79:         return forwarded.split("for=")[1].split(";")[0].strip()

Proposed change:

    if forwarded := request.headers.get("Forwarded"):
        # Parse Forwarded header (RFC 7239) safely
        # Example: Forwarded: for=192.0.2.60;proto=http;by=203.0.113.43
        try:
            for part in forwarded.split(","):
                for attr in part.split(";"):
                    attr = attr.strip()
                    if attr.lower().startswith("for="):
                        value = attr.split("=", 1)[1].strip().strip('"')
                        return value
        except Exception:
            pass

`#8` Invalid 'markers' format for colorama package in poetry.lock

poetry.lock L315

In the [[package]] entry for 'colorama', the 'markers' field is defined as a TOML table ({test = "sys_platform == "win32""}) instead of a PEP 508 marker string. Poetry's lock file expects 'markers' to be a string expression (e.g., "sys_platform == 'win32'"). This malformed structure can cause Poetry to fail parsing the lock file or ignore the marker, leading to incorrect dependency resolution/installation. Given the file is auto-generated, this strongly suggests an unintended manual change. Regenerate the lock with Poetry or correct the 'markers' field to a string.
Tags: bug, maintainability, compatibility
Affected code:

315: markers = {test = "sys_platform == \"win32\""}

Proposed change:

markers = "sys_platform == \"win32\""

`#9` Incorrect project root path added to sys.path (off-by-one parent index)

tests/configs/config_fn.py L5

The code appends the parent directory one level above the repository root to sys.path using parents[3]. Given this file lives at tests/configs/config_fn.py, the repository root is parents[2]. Using parents[3] can break local imports (e.g., lm_proxy) when the package is not installed, causing ModuleNotFoundError.
Tags: bug
Affected code:

5: root = Path(__file__).resolve().parents[3]

Proposed change:

root = Path(__file__).resolve().parents[2]

`#10` Test mutates global config without restoring it, causing cross-test side effects

tests/test_models_endpoint.py L53-L57

The test directly modifies env.config.model_listing_mode and does not restore the previous value. This introduces global state leakage that can make other tests order-dependent or flaky.
Tags: bug, maintainability
Affected code:

53:     env.config.model_listing_mode = ModelListingMode.IGNORE_WILDCARDS
54:     payload = json.loads((await models(req)).body.decode())
55:     assert len(payload["data"]) == 2  # Only 'a' and 'b'
56:     assert payload["data"][0]["id"] in ("a", "b")
57:     assert payload["data"][1]["id"] in ("a", "b")

Proposed change:

    prev_mode = env.config.model_listing_mode
    try:
        env.config.model_listing_mode = ModelListingMode.IGNORE_WILDCARDS
        payload = json.loads((await models(req)).body.decode())
        assert len(payload["data"]) == 2  # Only 'a' and 'b'
        assert payload["data"][0]["id"] in ("a", "b")
        assert payload["data"][1]["id"] in ("a", "b")
    finally:
        env.config.model_listing_mode = prev_mode

`#11` Test leaks environment variables and uses brittle logging assertion

tests/test_utils.py L46-L54

The test modifies process-wide environment variables via os.environ and doesn't restore them, which can pollute other tests. It also asserts an exact count of log records, making the test brittle. Use pytest's monkeypatch to isolate env changes and caplog.clear() with a predicate-based assertion to verify a warning was logged.
Tags: bug, maintainability
Affected code:

46: def test_replace_env_strings_recursive(caplog):
47:     os.environ['TEST_VAR1'] = 'env_value1'
48:     os.environ['TEST_VAR2'] = 'env_value2'
49:     assert replace_env_strings_recursive("env:TEST_VAR1") == 'env_value1'
50: 
51:     caplog.set_level(logging.WARNING)
52:     assert replace_env_strings_recursive("env:NON_EXIST") == ''
53:     assert len(caplog.records) == 1
54:

Proposed change:

def test_replace_env_strings_recursive(caplog, monkeypatch):
    monkeypatch.setenv('TEST_VAR1', 'env_value1')
    assert replace_env_strings_recursive("env:TEST_VAR1") == 'env_value1'

    caplog.set_level(logging.WARNING)
    caplog.clear()
    assert replace_env_strings_recursive("env:NON_EXIST") == ''
    assert any(record.levelno >= logging.WARNING for record in caplog.records)

Nayjest added 6 commits October 23, 2025 13:08

upd. gitignore, ignore coverage

aab65a9

pyproject: add testpaths

936c8dd

README.md: Debugging section added

cf1b97a

cs fixes

2017e85

upd. poetry.lock

8e5eca8